An Exploration of Pursuing Professional Explorations

My Data Science Job Search: August - October, 2020

Rohan Lewis

2020.09.29 (Updated 2020.10.06)

I began applying for Information Technology jobs on August 11th, 2020, using LinkedIn as my primary source of listings. I alternated between large and small companies, between EasyApply and standard applications through company job portals or website submissions, and among various locations across the US. I even applied to a job in Sydney, Australia (by accident) and one in Toronto, Canada.

I saved my application information in an Excel spreadsheet as I applied, and decided to use it to practice some visualization techniques.

In [1]:
#General Packages.
from datetime import date
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import re

#Specific Packages
from wordcloud import WordCloud
import bar_chart_race as bcr
from IPython.display import Video
import plotly.graph_objects as go
from pywaffle import Waffle

#Read data.
df = pd.read_excel(r'D:\Data Science\Completed\Jobs\Jobs.xlsx')
lat_lons = pd.read_excel(r'D:\Data Science\Completed\Jobs\uscities.xlsx')

#Convert datetime columns to date.
df['Date_Applied'] = df['Date_Applied'].apply(lambda x: x.date())
df['Rejection_Email'] = df['Rejection_Email'].apply(lambda x: x.date())
df['Viewed_Email'] = df['Viewed_Email'].apply(lambda x: x.date())

#Current Numbers
print("As of " + date.today().strftime("%A, %B %d, %Y") + ", I have applied to " + str(df.shape[0]) + " jobs.")
As of Monday, October 12, 2020, I have applied to 509 jobs.
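As an aside, the `.apply(lambda x: x.date())` conversions above can also be written with the `.dt` accessor, which passes NaT through untouched for rows with no rejection or viewed email. A minimal sketch on a toy column (not the notebook's actual data):

```python
import pandas as pd

#Toy column standing in for a date column: one real date, one missing.
emails = pd.to_datetime(pd.Series(['2020-09-17', None]))

#.dt.date converts datetimes to plain dates; NaT stays NaT.
dates = emails.dt.date
print(dates[0])       # 2020-09-17
print(pd.isna(dates[1]))  # True
```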

Job Title

This section looks at the specific job titles.

  1. I eliminated all non-alphabetical characters.

  2. A recurring theme in job titles was numbering, such as "Data Analyst II" or "Data Scientist III". Since the digits had already been eliminated, I also needed to eliminate Roman numerals (only I, II, and III occurred).

  3. The cleaning mangled a few legitimate tokens, so I restored those.
    1. "AR VR" was changed back to "AR/VR" for one title.
    2. "C C" was changed back to "C2C" for one title.
    3. "Microsoft" was changed to "Microsoft-365" for one title.
    4. "Non IT" was changed back to "Non-IT" for one title.

  4. Words were then tallied.
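The stripping in steps 1 and 2 can also be done with word-boundary regexes, which catch a numeral at the very start or end of the string as well. This is a sketch of an alternative, not the code the notebook actually runs below:

```python
import re

def clean_title(title):
    #Keep letters only, then drop standalone I, II, III tokens.
    letters_only = re.sub('[^a-zA-Z]', ' ', title)
    no_numerals = re.sub(r'\b(?:III|II|I)\b', ' ', letters_only)
    #Collapse the repeated spaces left behind.
    return ' '.join(no_numerals.split())

print(clean_title('Data Analyst II'))        # Data Analyst
print(clean_title('Sr. Data Scientist III')) # Sr Data Scientist
```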

Word Frequencies

Below are the top 10 and bottom 10 words from the job titles (the bottom 10 are a sample of the many words that occur only once).

In [2]:
#Combine all job titles into one string.
jobs_string = ' '.join(df['Title'])
#Only letters are useful.
regex = re.compile('[^a-zA-Z]')
#Remove all non letters, and remove ' I ', ' II ', ' III '.
jobs_string = regex.sub(' ', jobs_string).replace(' I ',' ').replace(' II ',' ').replace(' III ',' ')
#Specific Replacement.
jobs_string = jobs_string.replace('AR VR','AR/VR').replace('C C', 'C2C').replace('Microsoft', 'Microsoft-365').replace('Non IT', 'Non-IT')

#Create a frequency distribution of all words in the job titles.
jobs_dict = {}
jobs_words = jobs_string.split()
for word in jobs_words :
    if word not in jobs_dict :
        jobs_dict[word] = 1
    else :
        jobs_dict[word] += 1

#Convert frequency distribution to dataframe, sort by frequency.
jobs_df = pd.DataFrame({'Word' : list(jobs_dict.keys()),
                       'Count' : list(jobs_dict.values())}).sort_values(by = 'Count', ascending = False, axis = 0).reset_index(drop = True)
jobs_df.head(10)
Out[2]:
Word Count
0 Data 462
1 Scientist 288
2 Analyst 123
3 Engineer 72
4 Machine 49
5 Learning 49
6 Analytics 22
7 Science 21
8 Developer 18
9 Research 17
In [3]:
jobs_df.tail(10)
Out[3]:
Word Count
179 AWS 1
180 C2C 1
181 Delphi 1
182 Option 1
183 Sponsored 1
184 Products 1
185 Microsoft-365 1
186 Sales 1
187 healthcare 1
188 Revport 1
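Incidentally, the manual dictionary tally above can be written in one step with collections.Counter. A minimal sketch on a toy string, not the full job-title data:

```python
from collections import Counter

#Counter tallies hashable items; most_common() sorts by frequency.
words = 'Data Scientist Data Analyst Data Engineer'.split()
counts = Counter(words)
print(counts.most_common(1))  # [('Data', 3)]
```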

Word Cloud

I thought a word cloud would be a fun way to visualize the job titles. Word sizes have been rescaled rather than kept strictly proportional to frequency.

In [4]:
jobs_wc = WordCloud(background_color = 'white',
                    max_words = 300,
                    collocations = False,
                    relative_scaling = 0)
jobs_wc.generate(jobs_string)
plt.figure(figsize = (15, 11))

plt.imshow(jobs_wc, interpolation = 'bilinear')
plt.axis('off');

Companies

Some companies are hiring heavily. Others are recruiting and staffing agencies filling roles on behalf of other companies.

Once my application was in a system on a particular company's career portal page, it was easy to reapply. I used this quite a bit for companies like Amazon, Google, MITRE, and PayPal.

I used LinkedIn's EasyApply for many applications as well.

For the others, I sometimes applied as a guest, sometimes only had to upload my résumé and cover letter, and sometimes had to go through a 20-minute ordeal for a single opening. It varied. ¯\_( ͡° ͜ʖ ͡°)_/¯

Company Applications by Date

See Appendix for full table of cumulative applications by company and date, sorted alphabetically and chronologically, respectively.

In [5]:
#List of all companies.
companies = df['Company'].unique()
#Dates from first application to today.
date_index = pd.date_range(start = min(df['Date_Applied']), end = date.today())

#Create new data frame of 0s.
application_df = pd.DataFrame(index = date_index, columns = companies).fillna(0)

#Create cumulative count of job applications by company and date.
for i in range(len(df)) :
    company = df['Company'].iloc[i]
    date_app = df['Date_Applied'].iloc[i]
    application_df.loc[date_app:, company] += 1

#Alphabetical
application_df = application_df.reindex(sorted(application_df.columns), axis=1)
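The explicit loop above can also be expressed with pd.crosstab plus a cumulative sum. A sketch on a hypothetical three-application frame (the real notebook keeps the loop):

```python
import pandas as pd

#Hypothetical mini-frame with the same columns the loop relies on.
apps = pd.DataFrame({'Company': ['A', 'B', 'A'],
                     'Date_Applied': pd.to_datetime(['2020-08-11', '2020-08-11', '2020-08-13'])})

#Applications per company per day, reindexed over the full date range.
daily = pd.crosstab(apps['Date_Applied'], apps['Company'])
daily = daily.reindex(pd.date_range('2020-08-11', '2020-08-13'), fill_value=0)

#Running total by date, matching the loop's cumulative table.
cumulative = daily.cumsum()
print(cumulative['A'].tolist())  # [1, 1, 2]
```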
In [6]:
#Total number of applications to each company.
cumulative_app_count = application_df.iloc[[-1]]
major_companies = cumulative_app_count.columns[(cumulative_app_count >= 5).iloc[0]]
minor_companies = cumulative_app_count.columns[(cumulative_app_count < 5).iloc[0]].tolist()

#Create a dummy company called 'Other', containing all companies with fewer than 5 applications.
major_df = pd.DataFrame.copy(application_df)
major_df['Other'] = major_df[minor_companies].sum(axis = 1)
major_df.drop(minor_companies, axis = 1, inplace = True)

#Check number of bars to include.
#major_df.shape

Bar Chart Race

Below is an animation of the above data. Bar chart races run more smoothly with larger numbers, such as populations or monetary amounts, over longer periods of time, but I am happy with the way this turned out.

In [7]:
bold_colors = ['#f0f0f0', '#3cb44b', '#e6194b', '#fffac8', '#9a6324', '#e6beff', '#fabebe', '#000075',
               '#ffe119', '#008080', '#4363d8', '#ffffff', '#bcf60c', '#46f0f0', '#911eb4', '#800000',
               '#f032e6', '#808000', '#ffd8b1', '#f58231', '#aaffc3', '#000000', '#ca1699']

apps = bcr.bar_chart_race(df = major_df,
                          filename = 'applications.mp4',
                          orientation = 'h',
                          sort = 'desc',
                          n_bars = 15,
                          cmap = bold_colors[0:22],
                          filter_column_colors = False,
                          period_fmt = '%B %d, %Y',
                          period_label = {'x': 0.99,
                                          'y': 0.155,
                                          'ha': 'right',
                                          'size': 14},
                          period_summary_func = lambda v, r: {'x': 0.99,
                                                              'y': 0.05,
                                                              's': f"{v.sum():,.0f} applications completed.\nCompanies with fewer than 5 applications are grouped in 'Other'.",
                                                              'ha': 'right',
                                                              'size': 9,
                                                              'weight': 'normal'},
                          title = 'Total Number of Jobs Applied to by Company and Date',
                          steps_per_period = 10)

Video("applications.mp4")
Out[7]:

Job Location

Some slight modifications were made from LinkedIn data during the application process:

  1. For Curate Partners, a job location was changed from Raleigh-Durham-Chapel Hill Area to Raleigh.
  2. For Parker & Lynch, a job location was changed from Orange County to Los Angeles.
  3. For Synectics, a job location was changed from Greater Chicago to Chicago.
  4. For several companies, job locations were changed from "Greater" or "Metropolitan" of a city to that city.

In addition, for Common App, the job location was changed from none specified to Arlington. I did not learn about their opening from LinkedIn.

By City

Remote Locations

Several jobs were advertised with no city, only remote.

In [8]:
df[df['City'] == "Remote"]
Out[8]:
Title Company Size City State_abbv State Date_Posted Date_Applied Rejection_Email Viewed_Email CoID JobID URL
192 Data Scientist Pilot Flying J 10001+ Remote NaN NaN 2020-09-04 00:00:00 2020-09-06 2020-09-17 NaT NaN 2.019452e+09 https://www.linkedin.com/jobs/search/?currentJ...
413 Forward Deployed Data Scientist Cresta 11-50 Remote NaN NaN 2020-09-29 00:00:00 2020-09-30 NaT NaT NaN 2.155368e+09 https://www.linkedin.com/jobs/search/?currentJ...
420 Data Scientist White Ops 51-200 Remote NaN NaN > 1 week 2020-09-30 NaT NaT NaN 2.023661e+09 https://www.linkedin.com/jobs/search/?currentJ...
425 Platform Data Engineer Demyst 51-200 Remote NaN NaN 2020-10-01 00:00:00 2020-10-01 NaT NaT NaN 2.152177e+09 https://www.linkedin.com/jobs/search/?currentJ...
437 Intern - AI/Machine Learning Seagate Technology 10001+ Remote OR Oregon 2020-09-30 00:00:00 2020-10-01 NaT NaT 201783 2.183372e+09 https://www.linkedin.com/jobs/search/?currentJ...

I retrieved their office locations and manually entered them.

In [9]:
#Manual City, State_abbv, and State entry. 
dict_cs = {192: ("Knoxville", "TN", "Tennessee"),
           413: ("San Francisco", "CA", "California"),
           420: ("New York", "NY", "New York"),
           425: ("New York", "NY", "New York"),
           437: ("Portland", "OR", "Oregon")}

#Loop to manually enter missing City, State_abbv, and State entry for Remote locations.
for idx in dict_cs :
    temp_city = dict_cs[idx][0]
    temp_state_abbv = dict_cs[idx][1]
    temp_state = dict_cs[idx][2]
    df.at[idx, 'City'] = temp_city
    df.at[idx, 'State_abbv'] = temp_state_abbv
    df.at[idx, 'State'] = temp_state
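The three per-column .at writes in the loop can also be collapsed into a single .loc assignment per row. A sketch on a toy frame standing in for the real one:

```python
import pandas as pd

#Toy row mimicking a 'Remote' listing with missing location fields.
df = pd.DataFrame({'City': ['Remote'], 'State_abbv': [None], 'State': [None]})

#One labeled assignment fills all three columns at once.
df.loc[0, ['City', 'State_abbv', 'State']] = ('Knoxville', 'TN', 'Tennessee')
print(df.loc[0, 'City'])  # Knoxville
```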

Missing Cities

Several jobs I applied to were in cities not in the spreadsheet I downloaded.

In [10]:
#Select relevant columns.
df_loc = df[['City', 'State_abbv', 'State', 'Date_Applied']]

#Select only city, state, and location columns.
lat_lons = lat_lons[['city', 'state_id', 'lat', 'lng']]

#Count the number of applications.
city_tally = df_loc.groupby(['City', 'State_abbv', 'State']).count().reset_index()
#Merge to get latitude, longitude for each city.
merged = pd.merge(city_tally,
                  lat_lons,
                  how = 'left',
                  left_on = ['City', 'State_abbv'],
                  right_on = ['city', 'state_id'])



#Several cities are not in the list.
merged[merged['city'].isna()]
Out[10]:
City State_abbv State Date_Applied city state_id lat lng
8 Bedford MA Massachusetts 4 NaN NaN NaN NaN
21 Bridgewater NJ New Jersey 1 NaN NaN NaN NaN
27 Center Valley PA Pennsylvania 1 NaN NaN NaN NaN
43 Dallas - Ft. Worth TX Texas 5 NaN NaN NaN NaN
88 Patuxent River MD Maryland 1 NaN NaN NaN NaN
121 Sydney AU Australia 1 NaN NaN NaN NaN
124 Toronto CN Canada 1 NaN NaN NaN NaN
132 Weston MA Massachusetts 1 NaN NaN NaN NaN

Manual Removal and Entry

Two were from Australia and Canada, and were removed. For the rest, I retrieved coordinates from Google; an approximate midpoint was chosen for Dallas-Ft. Worth.

In [11]:
#Huxley job from Sydney, Australia was removed.
merged = merged[merged.State_abbv != 'AU']
#Prodigy Academy job from Toronto, Canada was removed.
merged = merged[merged.State_abbv != 'CN']

#Manual latitude and longitude entry. 
dict_loc = {8: (42.4906, -71.2760),
            21: (40.5940, -74.6049),
            27: (40.5294, -75.3937),
            43: (32.7598, -97.0646),
            88: (38.2773, -76.4229),
            132: (42.3668, -71.3031)}

#Loop to manually enter missing latitude and longitudes.
for idx in dict_loc :
    lat = dict_loc[idx][0]
    lon = dict_loc[idx][1]
    merged.at[idx, 'lat'] = lat
    merged.at[idx, 'lng'] = lon

#Rename columns and drop redundant columns    
merged = merged.rename(columns = {'Date_Applied' : 'Count',
                                  'lat': 'Latitude',
                                  'lng': 'Longitude'}).drop(['city', 'state_id'], axis = 1)

Mountain View

My left join inflated the totals by 11. I soon determined that California has two cities named Mountain View.

In [12]:
merged[merged.City == 'Mountain View']
Out[12]:
City State_abbv State Count Latitude Longitude
78 Mountain View CA California 11 37.4000 -122.0796
79 Mountain View CA California 11 38.0093 -122.1169

I removed the one not in the Bay Area. All cities, with their state, number of applications, and location, are displayed below.

In [13]:
merged = merged[(merged['Latitude'] != 38.0093) | (merged['Longitude'] != -122.1169)]
merged
Out[13]:
City State_abbv State Count Latitude Longitude
0 Andover MA Massachusetts 1 42.6554 -71.1418
1 Arlington VA Virginia 6 38.8786 -77.1011
2 Ashburn VA Virginia 1 39.0300 -77.4711
3 Asheville NC North Carolina 1 35.5704 -82.5536
4 Atlanta GA Georgia 17 33.7627 -84.4224
... ... ... ... ... ... ...
130 Wellesley MA Massachusetts 6 42.3043 -71.2855
131 West Menlo Park CA California 2 37.4338 -122.2034
132 Weston MA Massachusetts 1 42.3668 -71.3031
133 Wilmington DE Delaware 1 39.7415 -75.5413
134 Woonsocket RI Rhode Island 1 42.0010 -71.4993

132 rows × 6 columns
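An alternative to filtering on the exact coordinates would be drop_duplicates on the (City, State_abbv) pair, though keep='first' only retains the right row because the Bay Area entry happens to come first, so the coordinate filter is the safer fix. A sketch on a toy frame:

```python
import pandas as pd

#Two California cities named Mountain View, Bay Area row first.
merged = pd.DataFrame({'City': ['Mountain View', 'Mountain View'],
                       'State_abbv': ['CA', 'CA'],
                       'Latitude': [37.4000, 38.0093]})

#Keep only the first row for each (City, State_abbv) pair.
deduped = merged.drop_duplicates(subset=['City', 'State_abbv'], keep='first')
print(len(deduped))  # 1
```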

By State

See Appendix for frequency by state.

In [14]:
#Group application counts by state, sum them, and sort by frequency.
state_tally = merged[['Count', 'State_abbv', 'State']].groupby(['State_abbv', 'State']).sum().sort_values(by = 'Count', ascending = False, axis = 0).reset_index()

US Map

Below is an interactive map of the US. Applications are tallied by city and by state.

In [15]:
#Singular or Plural.
def f(row) :
    if row['Count'] == 1 :
        string_val = ' application in '
    else :
        string_val = ' applications in '

    return string_val

#Number of applications per city as text.
merged['Text'] = merged['Count'].astype(str) + merged.apply(f, axis = 1) + merged['City'] + ', ' + merged['State_abbv'] + '.'

#Number of applications per state as text.
state_tally['Text'] = state_tally['Count'].astype(str) + state_tally.apply(f, axis = 1) + state_tally['State'] + '.'

#Color in states by number of applications.
state_map_data = go.Choropleth(locations = state_tally['State_abbv'],
                               z = state_tally['Count'],
                               text = state_tally['Text'],
                               hoverinfo = 'text',
                               locationmode = 'USA-states',
                               colorbar = {'title': "<b>Applications</b>",
                                           'thicknessmode': "pixels",
                                           'thickness': 70,
                                           'lenmode': "pixels",
                                           'len': 400,
                                           'titlefont': {'size': 16},
                                           'tickfont': {'size': 12},
                                           'tickvals': [0, 20, 40, 60, 80, 100]},
                               colorscale = 'Blues')

#Plot cities, size corresponds to number of applications.
city_map_data = go.Scattergeo(lon = merged['Longitude'],
                              lat = merged['Latitude'],
                              text = merged['Text'],
                              hoverinfo = 'text',
                              locationmode = 'USA-states',
                              marker = {'size': 10*np.sqrt(merged['Count']),
                                        'color': 'Darkgreen'})

data = [state_map_data, city_map_data]

fig = go.Figure(data = data)
fig.update_layout(title = {'text': 'Where I Have Applied (Hover for Cities and Count)',
                           'font': {'size': 30}},
                  geo_scope = 'usa',
                  width = 950,
                  height = 550)

Waffle Plot

Below is the distribution of applications by state.

In [16]:
#Sort by number of applications for each state.
waffle_data = state_tally[['State', 'Count']]

#Add an 'Other' row, seeded with the 2 applications removed earlier (Australia and Canada).
waffle_data = waffle_data.append(pd.DataFrame({'State': ['Other'], 'Count': [2]})).reset_index(drop = True)

to_drop = []
#Add applications from states with fewer than 4 to 'Other' (row index 33).
for i in waffle_data.index :
    if waffle_data.iloc[i]['Count'] < 4 :
        temp = waffle_data.iloc[i]['Count']
        waffle_data.at[33, 'Count'] += temp
        to_drop.append(i)

#Remove states with fewer than 4 applications.  Change orientation of data.
waffle_data = waffle_data.drop(labels = to_drop, axis = 0).reset_index(drop = True).set_index('State').transpose()
waffle_data.shape
Out[16]:
(1, 20)
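The hard-coded .at[33, 'Count'] above works because 'Other' lands at index 33, but a boolean mask avoids the magic number entirely. A sketch on a toy tally (the real 'Other' row is also seeded with the two non-US applications):

```python
import pandas as pd

#Toy state tally with two states below the cutoff of 4.
counts = pd.Series({'CA': 113, 'WA': 46, 'OR': 3, 'KY': 3})

#Split on the threshold and fold the small states into 'Other'.
waffle = counts[counts >= 4].copy()
waffle['Other'] = counts[counts < 4].sum()
print(waffle['Other'])  # 6
```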
In [17]:
fig = plt.figure(FigureClass = Waffle,
                 rows = 17,
                 values = waffle_data.values.tolist()[0],
                 labels = waffle_data.columns.tolist(),
                 figsize = (14, 9),
                 colors = bold_colors[0:20],
                 legend = {'loc': 'upper left',
                           'ncol': 2,
                           'fontsize': 13})

Appendix

Company Applications by Date

In [18]:
pd.set_option('display.max_columns', None)
display(application_df)
AETEA Information Technology ALTEN AP Professionals ARYZTA Accelere Addison Group Addison Professional Financial Search LLC Aditi Consulting Advanced Auto Parts Age of Learning Alexander Technology Group Amazon American Bureau of Shipping Analytic Recruiting Inc. Apex Systems Artisan Talent Ascii Group, LLC Atlas Reasearch Austin Fraser Averity BCG Digital Ventures Bayside Solutions Bear Cognition Big Cloud BlueAlly Services BombBomb BrandMuscle Broadridge Brooksource Burtch Works CBTS CVS / Aetna CarMax CareHarmony Caterpillar CircleUp Cisco ClearBridge Technology Group Clever Devices Cloud9 Technologies, LLC Coit Group Collabera, Inc. Common App CompuGain CoreSite Cornerstone Staffing Solutions, Inc. Coursera Crawford Thomas Recruiting Cresta Critical Mass Curate Partners CybeCys, Inc. CyberCoders DMI DataLab USA Delphi-US, LLC Demyst Dick's Sporting Goods Diversant, LLC Dstillery EPITEC Edison Eliassen Group Enhance IT Entech Entelligence EpicTec Evernote Expedia FICO Fandango FedEx Fladger Associates FleetCor Technologies, Inc. Flexton Flywheel Digital ForgeRock Forrest Solutions Further Enterprise Solutions Gambit Technologies Gap Inc. GitHub Good Apple Google Greater New York Insurance Companies Greenphire Gsquared Group HIRECLOUT HP Harnham Havas Media Group Hays Hired by Matrix, Inc Hirewell Homesnap Horizon Media Horizontal Talent Hunter International Recruiting Huxley IBM IDR IDR, Inc. IQVIA ISO Ibotta Idexcel Illumination Works Innovative Systems Group Intelliswift Software, Inc. International Consulting Associates, Inc. JM Eagle JPI Jefferson Frank Jobot Jobspring Partners KGS Technology Group, Inc Kairos Living Kenshoo Komodo Health Kvaliro LevelUP LexisNexis LinkedIn LivePerson LockerDome M Science MITRE Macro Eyes Magnifi MaxisIT, Inc. McKinley Marketing Partners Media Assembly Meredith Corporation Mesh Recruiting, LLC Microsoft MindPool, Inc. 
Modis Moodys Northwest Consulting MotiveMetrics Mount Indie Next Insurance Ntelicor Nvidia OkCupid Olive Onebridge OpenArc, LLC. Optello Optomi PRI Technology Parker and Lynch Paro.io Patel Consulatants Pathrise PayPal Pilot Flying J Planet Pharma Prime Team Partners Proclinical Ltd. Prodigy Education Puls Pyramid Consulting, Inc. Quadrant Resource Radiansys, Inc. Randstad Real Rent-A-Center Reply Retail Solutions Inc. SBS Creatix, LLC Sand Cherry Associates Sanjay Bharti, MD PLLC Scale Media Scion Staffing Seagate Technology Selling Simplified Servpro Industries, LLC ShootProof SkyWater Search Partners Sogeti Sonder Inc. Susquehanna International Group, LLP Swift Strategic Solutions Inc. Synectics Inc. Synergis Systecon North America TRC Staffing Services, Inc. Tech Observer TechWorkers Technology Ventures Tencent The AI Institute The Equus Group The Home Depot The Jacobson Group The Judge Group The Lab Consulting Tiger Analytics Topco Associates LLC Toyoda Gosei Americas Ursus, Inc. Valassis Via Visible Walgreens White Ops Wimmer Solutions Wind River Wish Wonderlic WorldLink US Yoh, A Day & Zimmermann Company, LLC s.com zyBooks
2020-08-11 0 0 0 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2020-08-12 0 0 0 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2020-08-13 0 0 0 0 0 0 0 0 0 0 0 13 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
2020-08-14 0 0 0 0 0 0 0 0 0 0 0 13 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
2020-08-15 0 0 0 0 0 0 0 0 0 0 0 13 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 12 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2020-10-08 2 1 1 1 4 1 1 1 1 1 1 29 1 1 11 1 1 1 2 3 1 1 1 1 1 0 0 1 3 3 1 13 1 1 6 1 4 1 1 0 1 1 1 1 1 1 1 2 1 1 2 1 8 2 1 2 1 5 6 1 1 1 13 1 1 1 0 3 3 1 1 1 1 4 1 1 1 1 1 1 6 10 1 19 1 1 1 1 4 3 2 1 1 1 1 1 1 1 2 2 3 0 1 1 3 1 1 1 2 1 1 2 1 4 4 2 1 1 1 0 1 1 1 3 1 2 21 1 1 1 1 2 0 1 7 1 3 1 1 1 2 1 1 1 1 1 2 38 0 1 2 1 1 6 21 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 2 1 1 6 4 1 1 1 1 1 1 1 0 2 1 1 1 1 1 1 1 1 1 7 1 1 1 0 1 1 2 1 1 1 1 1 0 1 6 1 1 3 0 1
2020-10-09 2 1 1 1 4 1 1 1 1 1 1 29 1 1 11 1 1 1 2 3 1 1 1 1 1 1 1 1 4 3 1 13 1 1 6 1 4 1 1 1 1 1 1 1 1 1 1 3 1 1 2 1 8 2 1 2 1 5 6 1 1 1 13 1 1 1 0 3 3 1 1 1 1 4 1 1 1 1 1 1 6 10 1 19 1 1 1 1 4 3 2 1 1 1 1 1 2 1 2 2 3 0 1 1 3 1 1 1 2 1 1 2 1 4 4 2 1 1 1 0 1 1 1 3 1 2 21 1 1 1 1 2 0 1 7 1 3 1 1 1 2 1 1 1 1 1 2 38 0 1 2 1 1 6 22 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 2 1 1 6 6 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 7 1 1 1 0 1 1 2 1 1 1 1 1 0 1 6 1 1 3 1 1
2020-10-10 2 1 1 1 4 1 1 1 1 1 1 29 1 1 11 1 1 1 2 3 1 1 1 1 1 1 1 1 4 3 1 13 1 1 6 1 4 1 1 1 1 1 1 1 1 1 1 3 1 1 2 1 8 2 1 2 1 5 6 1 1 1 13 1 1 1 0 3 3 1 1 1 1 4 1 1 1 1 1 1 6 10 1 19 1 1 1 1 4 3 2 1 1 1 1 1 2 1 2 2 3 0 1 1 3 1 1 1 2 1 1 2 1 4 4 2 1 1 1 0 1 1 1 3 1 2 21 1 1 1 1 2 0 1 7 1 3 1 1 1 2 1 1 1 1 1 2 38 0 1 2 1 1 6 22 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 2 1 1 6 6 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 7 1 1 1 0 1 1 2 1 1 1 1 1 0 1 6 1 1 3 1 1
2020-10-11 2 1 1 1 4 1 1 1 1 1 1 29 1 1 11 1 1 1 2 3 1 1 1 1 1 1 1 1 4 3 1 13 1 1 6 1 4 1 1 1 1 1 1 1 1 1 1 3 1 1 2 1 8 2 1 2 1 5 6 1 1 1 13 1 1 1 0 3 3 1 1 1 1 4 1 1 1 1 1 1 6 10 1 19 1 1 1 1 4 3 2 1 1 1 1 1 2 1 2 2 3 0 1 1 3 1 1 1 2 1 1 2 1 4 4 2 1 1 1 0 1 1 1 3 1 2 21 1 1 1 1 2 0 1 7 1 3 1 1 1 2 1 1 1 1 1 2 38 0 1 2 1 1 6 22 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 2 1 1 6 6 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 7 1 1 1 0 1 1 2 1 1 1 1 1 0 1 6 1 1 3 1 1
2020-10-12 2 1 1 1 4 1 1 1 1 1 1 29 1 1 11 1 1 1 3 3 1 1 1 1 1 1 1 1 4 3 1 13 1 1 6 1 4 1 1 1 1 1 1 1 1 1 1 3 1 1 2 1 8 2 1 2 1 5 6 1 1 1 13 1 1 1 1 3 3 1 1 1 1 4 1 1 1 1 1 2 6 10 1 19 1 1 1 1 4 4 2 1 1 1 1 1 2 1 2 2 3 1 1 1 3 1 1 1 2 1 1 2 1 6 4 2 1 1 1 1 1 1 1 3 1 2 21 1 1 1 1 2 1 1 7 1 3 1 1 1 2 1 1 1 1 1 2 39 1 1 2 1 1 6 22 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 6 6 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 7 1 1 1 1 1 1 2 1 1 1 1 1 1 1 6 1 1 3 1 1

63 rows × 214 columns

State Frequency

In [19]:
state_tally
Out[19]:
State_abbv State Count Text
0 CA California 113 113 applications in California.
1 WA Washington 46 46 applications in Washington.
2 NY New York 44 44 applications in New York.
3 TX Texas 38 38 applications in Texas.
4 VA Virginia 34 34 applications in Virginia.
5 IL Illinois 28 28 applications in Illinois.
6 PA Pennsylvania 28 28 applications in Pennsylvania.
7 NC North Carolina 22 22 applications in North Carolina.
8 MA Massachusetts 21 21 applications in Massachusetts.
9 GA Georgia 18 18 applications in Georgia.
10 MD Maryland 17 17 applications in Maryland.
11 DC District Of Columbia 16 16 applications in District Of Columbia.
12 CO Colorado 15 15 applications in Colorado.
13 OH Ohio 8 8 applications in Ohio.
14 AZ Arizona 8 8 applications in Arizona.
15 MN Minnesota 8 8 applications in Minnesota.
16 CT Connecticut 7 7 applications in Connecticut.
17 NJ New Jersey 6 6 applications in New Jersey.
18 MI Michigan 6 6 applications in Michigan.
19 OR Oregon 3 3 applications in Oregon.
20 KY Kentucky 3 3 applications in Kentucky.
21 TN Tennessee 3 3 applications in Tennessee.
22 IN Indiana 3 3 applications in Indiana.
23 MO Missouri 2 2 applications in Missouri.
24 RI Rhode Island 2 2 applications in Rhode Island.
25 SC South Carolina 2 2 applications in South Carolina.
26 UT Utah 1 1 application in Utah.
27 AR Arkansas 1 1 application in Arkansas.
28 KS Kansas 1 1 application in Kansas.
29 ID Idaho 1 1 application in Idaho.
30 FL Florida 1 1 application in Florida.
31 DE Delaware 1 1 application in Delaware.
32 WV West Virginia 1 1 application in West Virginia.